C++ II

Compound Data Types continued. 1

Dynamic Memory. 1

Data Structures. 4

Other Data Types. 10

Object oriented programming. 13

Classes I 13

Classes II 21

Friendship and inheritance. 25

Polymorphism.. 32

Advanced Concepts. 38

Templates. 38

Namespaces. 44

Exceptions. 47

Type Casting. 50

Preprocessor directives. 56

C++ Standard Library. 60

Input/Output with files. 60

 

Compound Data Types continued

Dynamic Memory

Until now, in all our programs, we have only had as much memory available as we declared for our variables, having the size of all of them to be determined in the source code, before the execution of the program. But, what if we need a variable amount of memory that can only be determined during runtime? For example, in the case that we need some user input to determine the necessary amount of memory space.

The answer is dynamic memory, for which C++ integrates the operators new and delete.

Operators new and new[]

In order to request dynamic memory we use the operator new. new is followed by a data type specifier and -if a sequence of more than one element is required- the number of these within brackets []. It returns a pointer to the beginning of the new block of memory allocated. Its form is:

pointer = new type
pointer = new type [elements]

The first expression is used to allocate memory to contain one single element of type type. The second one is used to assign a block (an array) of elements of type type, where elements is an integer value representing the number of these. For example:

      int * bobby;

      bobby = new int [5];

In this case, the system dynamically assigns space for five elements of type int and returns a pointer to the first element of the sequence, which is assigned to bobby. Therefore, now, bobby points to a valid block of memory with space for five elements of type int.

The first element pointed by bobby can be accessed either with the expression bobby[0] or the expression *bobby. Both are equivalent as has been explained in the section about pointers. The second element can be accessed either with bobby[1] or *(bobby+1) and so on...

You could be wondering the difference between declaring a normal array and assigning dynamic memory to a pointer, as we have just done. The most important difference is that the size of an array has to be a constant value, which limits its size to what we decide at the moment of designing the program, before its execution, whereas the dynamic memory allocation allows us to assign memory during the execution of the program (runtime) using any variable or constant value as its size.

The dynamic memory requested by our program is allocated by the system from the memory heap. However, computer memory is a limited resource, and it can be exhausted. Therefore, it is important to have some mechanism to check if our request to allocate memory was successful or not.

C++ provides two standard methods to check if the allocation was successful:

One is by handling exceptions. Using this method an exception of type bad_alloc is thrown when the allocation fails. Exceptions are a powerful C++ feature explained later in these tutorials. But for now you should know that if this exception is thrown and it is not handled by a specific handler, the program execution is terminated.

This exception method is the default method used by new, and is the one used in a declaration like:

      bobby = new int [5];  // if it fails an exception is thrown

The other method is known as nothrow, and what happens when it is used is that when a memory allocation fails, instead of throwing a bad_alloc exception or terminating the program, the pointer returned by new is a null pointer, and the program continues its execution.

This method can be specified by using a special object called nothrow as parameter for new:

      bobby = new (nothrow) int [5];

In this case, if the allocation of this block of memory failed, the failure could be detected by checking if bobby took a null pointer value:

      int * bobby;

      bobby = new (nothrow) int [5];

      if (bobby == 0) {

        // error assigning memory. Take measures.

      };

This nothrow method requires more work than the exception method, since the value returned has to be checked after each and every memory allocation, but I will use it in our examples due to its simplicity. Anyway this method can become tedious for larger projects, where the exception method is generally preferred. The exception method will be explained in detail later in this tutorial.

Operator delete and delete[]

Since the necessity of dynamic memory is usually limited to specific moments within a program, once it is no longer needed it should be freed so that the memory becomes available again for other requests of dynamic memory. This is the purpose of the operator delete, whose format is:

      delete pointer;

      delete [] pointer;

The first expression should be used to delete memory allocated for a single element, and the second one for memory allocated for arrays of elements.

The value passed as argument to delete must be either a pointer to a memory block previously allocated with new, or a null pointer (in the case of a null pointer, delete produces no effect).

      // rememb-o-matic

      #include <iostream>

      using namespace std;

     

      int main ()

      {

        int i,n;

        int * p;

        cout << "How many numbers would you like to type? ";

        cin >> i;

        p= new (nothrow) int[i];

        if (p == 0)

          cout << "Error: memory could not be allocated";

        else

        {

          for (n=0; n<i; n++)

          {

            cout << "Enter number: ";

            cin >> p[n];

          }

          cout << "You have entered: ";

          for (n=0; n<i; n++)

            cout << p[n] << ", ";

          delete[] p;

        }

        return 0;

      }

Output

      How many numbers would you like to type? 5

      Enter number : 75

      Enter number : 436

      Enter number : 1067

      Enter number : 8

      Enter number : 32

      You have entered: 75, 436, 1067, 8, 32,

Notice how the value within brackets in the new statement is a variable value entered by the user (i), not a constant value:

      p= new (nothrow) int[i];

But the user could have entered a value for i so big that our system could not handle it. For example, when I tried to give a value of 1 billion to the "How many numbers" question, my system could not allocate that much memory for the program and I got the text message we prepared for this case (Error: memory could not be allocated). Remember that in the case that we tried to allocate the memory without specifying the nothrow parameter in the new expression, an exception would be thrown, which if it's not handled terminates the program.

It is a good practice to always check if a dynamic memory block was successfully allocated. Therefore, if you use the nothrow method, you should always check the value of the pointer returned. Otherwise, use the exception method, even if you do not handle the exception. This way, the program will terminate at that point without causing the unexpected results of continuing executing a code that assumes a block of memory to have been allocated when in fact it has not.

Dynamic memory in ANSI-C

Operators new and delete are exclusive of C++. They are not available in the C language. But using pure C language, dynamic memory can also be used through the functions malloc, calloc, realloc and free, defined in the <cstdlib> header file, and since C++ is a superset of C, these functions are also available to C++ programmers.

The memory blocks allocated by these functions are not compatible with those returned by new, so each one should be manipulated with its own set of functions or operators.

Data Structures

 

We have already learned how groups of sequential data can be used in C++. But this is somewhat restrictive, since in many occasions what we want to store are not mere sequences of elements all of the same data type, but sets of different elements with different data types.

Data structures

A data structure is a group of data elements grouped together under one name. These data elements, known as members, can have different types and different lengths. Data structures are declared in C++ using the following syntax:

struct structure_name {
member_type1 member_name1;
member_type2 member_name2;
member_type3 member_name3;
.
.
} object_names;

where structure_name is a name for the structure type, object_name can be a set of valid identifiers for objects that have the type of this structure. Within braces { } there is a list with the data members, each one is specified with a type and a valid identifier as its name.

The first thing we have to know is that a data structure creates a new type: Once a data structure is declared, a new type with the identifier specified as structure_name is created and can be used in the rest of the program as if it was any other type. For example:

      struct product {
        int weight;
        float price;
      } ;
      
      product apple;
      product banana, melon;

We have first declared a structure type called product with two members: weight and price, each of a different fundamental type. We have then used this name of the structure type (product) to declare three objects of that type: apple, banana and melon as we would have done with any fundamental data type.

Once declared, product has become a new valid type name like the fundamental ones int, char or short and from that point on we are able to declare objects (variables) of this compound new type, like we have done with apple, banana and melon.

Right at the end of the struct declaration, and before the ending semicolon, we can use the optional field object_name to directly declare objects of the structure type. For example, we can also declare the structure objects apple, banana and melon at the moment we define the data structure type this way:

      struct product {
        int weight;
        float price;
      } apple, banana, melon;

It is important to clearly differentiate between what is the structure type name, and what is an object (variable) that has this structure type. We can instantiate many objects (i.e. variables, like apple, banana and melon) from a single structure type (product).

Once we have declared our three objects of a determined structure type (apple, banana and melon) we can operate directly with their members. To do that we use a dot (.) inserted between the object name and the member name. For example, we could operate with any of these elements as if they were standard variables of their respective types:

      apple.weight
      apple.price
      banana.weight
      banana.price
      melon.weight
      melon.price

Each one of these has the data type corresponding to the member they refer to: apple.weight, banana.weight and melon.weight are of type int, while apple.price, banana.price and melon.price are of type float.

Let's see a real example where you can see how a structure type can be used in the same way as fundamental types:

      // example about structures
      #include <iostream>
      #include <string>
      #include <sstream>
      using namespace std;
      
      struct movies_t {
        string title;
        int year;
      } mine, yours;
      
      void printmovie (movies_t movie);
      
      int main ()
      {
        string mystr;
      
        mine.title = "2001 A Space Odyssey";
        mine.year = 1968;
      
        cout << "Enter title: ";
        getline (cin,yours.title);
        cout << "Enter year: ";
        getline (cin,mystr);
        stringstream(mystr) >> yours.year;
      
        cout << "My favorite movie is:\n ";
        printmovie (mine);
        cout << "And yours is:\n ";
        printmovie (yours);
        return 0;
      }
      
      void printmovie (movies_t movie)
      {
        cout << movie.title;
        cout << " (" << movie.year << ")\n";
      }
Output
      Enter title: Alien
      Enter year: 1979
      
      My favorite movie is:
       2001 A Space Odyssey (1968)
      And yours is:
       Alien (1979)

The example shows how we can use the members of an object as regular variables. For example, the member yours.year is a valid variable of type int, and mine.title is a valid variable of type string.

The objects mine and yours can also be treated as valid variables of type movies_t, for example we have passed them to the function printmovie as we would have done with regular variables. Therefore, one of the most important advantages of data structures is that we can either refer to their members individually or to the entire structure as a block with only one identifier.

Data structures are a feature that can be used to represent databases, especially if we consider the possibility of building arrays of them:

      // array of structures
      #include <iostream>
      #include <string>
      #include <sstream>
      using namespace std;
      
      #define N_MOVIES 3
      
      struct movies_t {
        string title;
        int year;
      } films [N_MOVIES];
      
      void printmovie (movies_t movie);
      
      int main ()
      {
        string mystr;
        int n;
      
        for (n=0; n<N_MOVIES; n++)
        {
          cout << "Enter title: ";
          getline (cin,films[n].title);
          cout << "Enter year: ";
          getline (cin,mystr);
          stringstream(mystr) >> films[n].year;
        }
      
        cout << "\nYou have entered these movies:\n";
        for (n=0; n<N_MOVIES; n++)
          printmovie (films[n]);
        return 0;
      }
      
      void printmovie (movies_t movie)
      {
        cout << movie.title;
        cout << " (" << movie.year << ")\n";
      }

Output

      Enter title: Blade Runner
      Enter year: 1982
      Enter title: Matrix
      Enter year: 1999
      Enter title: Taxi Driver
      Enter year: 1976
       
      You have entered these movies:
      Blade Runner (1982)
      Matrix (1999)
      Taxi Driver (1976)

Pointers to structures

Like any other type, structures can be pointed by its own type of pointers:

      struct movies_t {
        string title;
        int year;
      };
      
      movies_t amovie;
      movies_t * pmovie;

Here amovie is an object of structure type movies_t, and pmovie is a pointer to point to objects of structure type movies_t. So, the following code would also be valid:

      pmovie = &amovie;

The value of the pointer pmovie would be assigned to a reference to the object amovie (its memory address).

We will now go with another example that includes pointers, which will serve to introduce a new operator: the arrow operator (->):

      // pointers to structures
      #include <iostream>
      #include <string>
      #include <sstream>
      using namespace std;
      
      struct movies_t {
        string title;
        int year;
      };
      
      int main ()
      {
        string mystr;
      
        movies_t amovie;
        movies_t * pmovie;
        pmovie = &amovie;
      
        cout << "Enter title: ";
        getline (cin, pmovie->title);
        cout << "Enter year: ";
        getline (cin, mystr);
        (stringstream) mystr >> pmovie->year;
      
        cout << "\nYou have entered:\n";
        cout << pmovie->title;
        cout << " (" << pmovie->year << ")\n";
      
        return 0;
      }
Output
      Enter title: Invasion of the body snatchers
      Enter year: 1978
       
      You have entered:
      Invasion of the body snatchers (1978)

The previous code includes an important introduction: the arrow operator (->). This is a dereference operator that is used exclusively with pointers to objects with members. This operator serves to access a member of an object to which we have a reference. In the example we used:

      pmovie->title

Which is for all purposes equivalent to:

      (*pmovie).title

Both expressions pmovie->title and (*pmovie).title are valid and both mean that we are evaluating the member title of the data structure pointed by a pointer called pmovie. It must be clearly differentiated from:

      *pmovie.title

which is equivalent to:

      *(pmovie.title)

And that would access the value pointed by a hypothetical pointer member called title. The following panel summarizes possible combinations of pointers and structure members:

Expression

What is evaluated

Equivalent

a.b

Member b of object a

 

a->b

Member b of object pointed by a

(*a).b

*a.b

Value pointed by member b of object a

*(a.b)

Nesting structures

Structures can also be nested so that a valid element of a structure can also be on its turn another structure.

      struct movies_t {
        string title;
        int year;
      };
      
      struct friends_t {
        string name;
        string email;
        movies_t favorite_movie;
        } charlie, maria;
      
      friends_t * pfriends = &charlie;

After the previous declaration we could use any of the following expressions:

      charlie.name
      maria.favorite_movie.title
      charlie.favorite_movie.year
      pfriends->favorite_movie.year

(where, by the way, the last two expressions refer to the same member).

 

Other Data Types

Defined data types (typedef)

C++ allows the definition of our own types based on other existing data types. We can do this using the keyword typedef, whose format is:

typedef existing_type new_type_name ;

where existing_type is a C++ fundamental or compound type and new_type_name is the name for the new type we are defining. For example:

      typedef char C;
      typedef unsigned int WORD;
      typedef char * pChar;
      typedef char field [50]; 

In this case we have defined four data types: C, WORD, pChar and field as char, unsigned int, char* and char[50] respectively, that we could perfectly use in declarations later as any other valid type:

      C mychar, anotherchar, *ptc1;
      WORD myword;
      pChar ptc2;
      field name; 

typedef does not create different types. It only creates synonyms of existing types. That means that the type of myword can be considered to be either WORD or unsigned int, since both are in fact the same type.

typedef can be useful to define an alias for a type that is frequently used within a program. It is also useful to define types when it is possible that we will need to change the type in later versions of our program, or if a type you want to use has a name that is too long or confusing.

Unions

Unions allow one same portion of memory to be accessed as different data types, since all of them are in fact the same location in memory. Its declaration and use is similar to the one of structures but its functionality is totally different:

 

union union_name {
      member_type1 member_name1;
      member_type2 member_name2;
      member_type3 member_name3;
      .
      .
      } object_names;

All the elements of the union declaration occupy the same physical space in memory. Its size is the one of the greatest element of the declaration. For example:

      union mytypes_t {
        char c;
        int i;
        float f;
        } mytypes;

defines three elements:

      mytypes.c
      mytypes.i
      mytypes.f

each one with a different data type. Since all of them are referring to the same location in memory, the modification of one of the elements will affect the value of all of them. We cannot store different values in them independent from each other.

One of the uses a union may have is to unite an elementary type with an array or structures of smaller elements. For example:

      union mix_t {
        long l;
        struct {
          short hi;
          short lo;
          } s;
        char c[4];
      } mix;

defines three names that allow to access the same group of 4 bytes: mix.l, mix.s and mix.c and which we can use according to how we want to access these bytes, as if they were a single long-type data, as if they were two short elements or as an array of char elements, respectively. I have mixed types, arrays and structures in the union so that you can see the different ways that we can access the data. For a little-endian system (most PC platforms), this union could be represented as:

The exact alignment and order of the members of a union in memory is platform dependant. Therefore be aware of possible portability issues with this type of use.

Anonymous unions

In C++ we have the option to declare anonymous unions. If we declare a union without any name, the union will be anonymous and we will be able to access its members directly by their member names. For example, look at the difference between these two structure declarations:

structure with regular union

structure with anonymous union

struct {
  char title[50];
  char author[50];
  union {
    float dollars;
    int yens;
  } price;
} book;
struct {
  char title[50];
  char author[50];
  union {
    float dollars;
    int yens;
  };
} book;

The only difference between the two pieces of code is that in the first one we have given a name to the union (price) and in the second one we have not. The difference is seen when we access the members dollars and yens of an object of this type. For an object of the first type, it would be:

      book.price.dollars
      book.price.yens

whereas for an object of the second type, it would be:

      book.dollars
      book.yens

Once again I remind you that because it is a union and not a struct, the members dollars and yens occupy the same physical space in the memory so they cannot be used to store two different values simultaneously. You can set a value for price in dollars or in yens, but not in both.

Enumerations (enum)

Enumerations create new data types to contain something different that is not limited to the values fundamental data types may take. Its form is the following:

enum enumeration_name {
  value1,
  value2,
  value3,
  .
  .
} object_names;

For example, we could create a new type of variable called color to store colors with the following declaration:

      enum colors_t {black, blue, green, cyan, red, purple, yellow, white}; 

Notice that we do not include any fundamental data type in the declaration. To say it somehow, we have created a whole new data type from scratch without basing it on any other existing type. The possible values that variables of this new type color_t may take are the new constant values included within braces. For example, once the colors_t enumeration is declared the following expressions will be valid:

      colors_t mycolor;
       
      mycolor = blue;
      if (mycolor == green) mycolor = red; 

Enumerations are type compatible with numeric variables, so their constants are always assigned an integer numerical value internally. If it is not specified, the integer value equivalent to the first possible value is equivalent to 0 and the following ones follow a +1 progression. Thus, in our data type colors_t that we have defined above, black would be equivalent to 0, blue would be equivalent to 1, green to 2, and so on.

We can explicitly specify an integer value for any of the constant values that our enumerated type can take. If the constant value that follows it is not given an integer value, it is automatically assumed the same value as the previous one plus one. For example:

      enum months_t { january=1, february, march, april,
                      may, june, july, august,
                      september, october, november, december} y2k;

In this case, variable y2k of enumerated type months_t can contain any of the 12 possible values that go from january to december and that are equivalent to values between 1 and 12 (not between 0 and 11, since we have made january equal to 1).

Object oriented programming

Classes I

A class is an expanded concept of a data structure: instead of holding only data, it can hold both data and functions.

An object is an instantiation of a class. In terms of variables, a class would be the type, and an object would be the variable.

Classes are generally declared using the keyword class, with the following format:

      lass class_name {
       access_specifier_1:
         member1;
       access_specifier_2:
         member2;
       ...
       } object_names;

Where class_name is a valid identifier for the class, object_names is an optional list of names for objects of this class. The body of the declaration can contain members, that can be either data or function declarations, and optionally access specifiers.

All is very similar to the declaration on data structures, except that we can now include also functions and members, but also this new thing called access specifier. An access specifier is one of the following three keywords: private, public or protected. These specifiers modify the access rights that the members following them acquire:

By default, all members of a class declared with the class keyword have private access for all its members. Therefore, any member that is declared before one other class specifier automatically has private access. For example:

      class CRectangle {
          int x, y;
        public:
          void set_values (int,int);
          int area (void);
        } rect;

Declares a class (i.e., a type) called CRectangle and an object (i.e., a variable) of this class called rect. This class contains four members: two data members of type int (member x and member y) with private access (because private is the default access level) and two member functions with public access: set_values() and area(), of which for now we have only included their declaration, not their definition.

Notice the difference between the class name and the object name: In the previous example, CRectangle was the class name (i.e., the type), whereas rect was an object of type CRectangle. It is the same relationship int and a have in the following declaration:

      int a;

where int is the type name (the class) and a is the variable name (the object).

After the previous declarations of CRectangle and rect, we can refer within the body of the program to any of the public members of the object rect as if they were normal functions or normal variables, just by putting the object's name followed by a dot (.) and then the name of the member. All very similar to what we did with plain data structures before. For example:

      rect.set_values (3,4);
      myarea = rect.area(); 

The only members of rect that we cannot access from the body of our program outside the class are x and y, since they have private access and they can only be referred from within other members of that same class.

Here is the complete example of class CRectangle:

      // classes example
      #include <iostream>
      using namespace std;
      
      class CRectangle {
          int x, y;
        public:
          void set_values (int,int);
          int area () {return (x*y);}
      };
      
      void CRectangle::set_values (int a, int b) {
        x = a;
        y = b;
      }
      
      int main () {
        CRectangle rect;
        rect.set_values (3,4);
        cout << "area: " << rect.area();
        return 0;
      }
Output
      area: 12

The most important new thing in this code is the operator of scope (::, two colons) included in the definition of set_values(). It is used to define a member of a class from outside the class declaration itself.

You may notice that the definition of the member function area() has been included directly within the definition of the CRectangle class given its extreme simplicity, whereas set_values() has only its prototype declared within the class, but its definition is outside it. In this outside declaration, we must use the operator of scope (::) to specify that we are defining a function that is a member of the class CRectangle and not a regular global function.

The scope operator (::) specifies the class to which the member being declared belongs, granting exactly the same scope properties as if this function definition was directly included within the class definition. For example, in the function set_values() of the previous code, we have been able to use the variables x and y, which are private members of class CRectangle, which means they are only accessible from other members of their class.

The only difference between defining a class member function completely within its class and to include only the prototype and later its definition, is that in the first case the function will automatically be considered an inline member function by the compiler, while in the second it will be a normal (not-inline) class member function, which in fact supposes no difference in behavior.

Members x and y have private access (remember that if nothing else is said, all members of a class defined with keyword class have private access). By declaring them private we deny access to them from anywhere outside the class. This makes sense, since we have already defined a member function to set values for those members within the object: the member function set_values(). Therefore, the rest of the program does not need to have direct access to them. Perhaps in a so simple example as this, it is difficult to see an utility in protecting those two variables, but in greater projects it may be very important that values cannot be modified in an unexpected way (unexpected from the point of view of the object).

One of the greater advantages of a class is that, as any other type, we can declare several objects of it. For example, following with the previous example of class CRectangle, we could have declared the object rectb in addition to the object rect:

      // example: one class, two objects
      #include <iostream>
      using namespace std;
      
      class CRectangle {
          int x, y;
        public:
          void set_values (int,int);
          int area () {return (x*y);}
      };
      
      void CRectangle::set_values (int a, int b) {
        x = a;
        y = b;
      }
      
      int main () {
        CRectangle rect, rectb;
        rect.set_values (3,4);
        rectb.set_values (5,6);
        cout << "rect area: " << rect.area() << endl;
        cout << "rectb area: " << rectb.area() << endl;
        return 0;
      }
Output
      rect area: 12
      rectb area: 30  

In this concrete case, the class (type of the objects) to which we are talking about is CRectangle, of which there are two instances or objects: rect and rectb. Each one of them has its own member variables and member functions.

Notice that the call to rect.area() does not give the same result as the call to rectb.area(). This is because each object of class CRectangle has its own variables x and y, as they, in some way, have also their own function members set_value() and area() that each uses its object's own variables to operate.

That is the basic concept of object-oriented programming: Data and functions are both members of the object. We no longer use sets of global variables that we pass from one function to another as parameters, but instead we handle objects that have their own data and functions embedded as members. Notice that we have not had to give any parameters in any of the calls to rect.area or rectb.area. Those member functions directly used the data members of their respective objects rect and rectb.

Constructors and destructors

Objects generally need to initialize variables or assign dynamic memory during their process of creation to become operative and to avoid returning unexpected values during their execution. For example, what would happen if in the previous example we called the member function area() before having called function set_values()? Probably we would have gotten an undetermined result since the members x and y would have never been assigned a value.

In order to avoid that, a class can include a special function called constructor, which is automatically called whenever a new object of this class is created. This constructor function must have the same name as the class, and cannot have any return type; not even void.

We are going to implement CRectangle including a constructor:

      // example: class constructor
      #include <iostream>
      using namespace std;
      
      class CRectangle {
          int width, height;
        public:
          CRectangle (int,int);
          int area () {return (width*height);}
      };
      
      CRectangle::CRectangle (int a, int b) {
        width = a;
        height = b;
      }
      
      int main () {
        CRectangle rect (3,4);
        CRectangle rectb (5,6);
        cout << "rect area: " << rect.area() << endl;
        cout << "rectb area: " << rectb.area() << endl;
        return 0;
      }
Output
      rect area: 12
      rectb area: 30  

As you can see, the result of this example is identical to the previous one. But now we have removed the member function set_values(), and have included instead a constructor that performs a similar action: it initializes the values of x and y with the parameters that are passed to it.

Notice how these arguments are passed to the constructor at the moment at which the objects of this class are created:

      CRectangle rect (3,4);
      CRectangle rectb (5,6);

Constructors cannot be called explicitly as if they were regular member functions. They are only executed when a new object of that class is created.

You can also see how neither the prototype nor the later constructor declaration includes a return value; not even void.

The destructor fulfills the opposite functionality. It is automatically called when an object is destroyed, either because its scope of existence has finished (for example, if it was defined as a local object within a function and the function ends) or because it is an object dynamically assigned and it is released using the operator delete.

The destructor must have the same name as the class, but preceded with a tilde sign (~) and it must also return no value.

The use of destructors is especially suitable when an object assigns dynamic memory during its lifetime and at the moment of being destroyed we want to release the memory that the object was allocated.

      // example on constructors and destructors
      #include <iostream>
      using namespace std;
      
      class CRectangle {
          int *width, *height;
        public:
          CRectangle (int,int);
          ~CRectangle ();
          int area () {return (*width * *height);}
      };
      
      CRectangle::CRectangle (int a, int b) {
        width = new int;
        height = new int;
        *width = a;
        *height = b;
      }
      
      CRectangle::~CRectangle () {
        delete width;
        delete height;
      }
      
      int main () {
        CRectangle rect (3,4), rectb (5,6);
        cout << "rect area: " << rect.area() << endl;
        cout << "rectb area: " << rectb.area() << endl;
        return 0;
      }
Output
      rect area: 12
      rectb area: 30  

Overloading Constructors

Like any other function, a constructor can also be overloaded with more than one function that have the same name but different types or number of parameters. Remember that for overloaded functions the compiler will call the one whose parameters match the arguments used in the function call. In the case of constructors, which are automatically called when an object is created, the one executed is the one that matches the arguments passed on the object declaration:

      // overloading class constructors
      #include <iostream>
      using namespace std;
      
      class CRectangle {
          int width, height;
        public:
          CRectangle ();
          CRectangle (int,int);
          int area (void) {return (width*height);}
      };
      
      CRectangle::CRectangle () {
        width = 5;
        height = 5;
      }
      
      CRectangle::CRectangle (int a, int b) {
        width = a;
        height = b;
      }
      
      int main () {
        CRectangle rect (3,4);
        CRectangle rectb;
        cout << "rect area: " << rect.area() << endl;
        cout << "rectb area: " << rectb.area() << endl;
        return 0;
      }
Output
      rect area: 12
      rectb area: 25  

In this case, rectb was declared without any arguments, so it has been initialized with the constructor that has no parameters, which initializes both width and height with a value of 5.

Important: Notice how if we declare a new object and we want to use its default constructor (the one without parameters), we do not include parentheses ():

      CRectangle rectb;   // right
      CRectangle rectb(); // wrong!

Default constructor

If you do not declare any constructors in a class definition, the compiler assumes the class to have a default constructor with no arguments. Therefore, after declaring a class like this one:

      class CExample {
        public:
          int a,b,c;
          void multiply (int n, int m) { a=n; b=m; c=a*b; };
        };

The compiler assumes that CExample has a default constructor, so you can declare objects of this class by simply declaring them without any arguments:

      CExample ex;

But as soon as you declare your own constructor for a class, the compiler no longer provides an implicit default constructor. So you have to declare all objects of that class according to the constructor prototypes you defined for the class:

      class CExample {
        public:
          int a,b,c;
          CExample (int n, int m) { a=n; b=m; };
          void multiply () { c=a*b; };
        };

Here we have declared a constructor that takes two parameters of type int. Therefore the following object declaration would be correct:

      CExample ex (2,3);

But,

      CExample ex;

Would not be correct, since we have declared the class to have an explicit constructor, thus replacing the default constructor.

But the compiler not only creates a default constructor for you if you do not specify your own. It provides three special member functions in total that are implicitly declared if you do not declare your own. These are the copy constructor, the copy assignment operator, and the default destructor.

The copy constructor and the copy assignment operator copy all the data contained in another object to the data members of the current object. For CExample, the copy constructor implicitly declared by the compiler would be something similar to:

      CExample::CExample (const CExample& rv) {
        a=rv.a;  b=rv.b;  c=rv.c;
        }

Therefore, the two following object declarations would be correct:

      CExample ex (2,3);
      CExample ex2 (ex);   // copy constructor (data copied from ex)

Pointers to classes

It is perfectly valid to create pointers that point to classes. We simply have to consider that once declared, a class becomes a valid type, so we can use the class name as the type for the pointer. For example:

      CRectangle * prect;

is a pointer to an object of class CRectangle.

As it happened with data structures, in order to refer directly to a member of an object pointed by a pointer we can use the arrow operator (->) of indirection. Here is an example with some possible combinations:

      // pointer to classes example
      #include <iostream>
      using namespace std;
      
      class CRectangle {
          int width, height;
        public:
          void set_values (int, int);
          int area (void) {return (width * height);}
      };
      
      void CRectangle::set_values (int a, int b) {
        width = a;
        height = b;
      }
      
      int main () {
        CRectangle a, *b, *c;
        CRectangle * d = new CRectangle[2];
        b= new CRectangle;
        c= &a;
        a.set_values (1,2);
        b->set_values (3,4);
        d->set_values (5,6);
        d[1].set_values (7,8);
        cout << "a area: " << a.area() << endl;
        cout << "*b area: " << b->area() << endl;
        cout << "*c area: " << c->area() << endl;
        cout << "d[0] area: " << d[0].area() << endl;
        cout << "d[1] area: " << d[1].area() << endl;
        return 0;
      }
Output
      a area: 2
      *b area: 12
      *c area: 2
      d[0] area: 30
      d[1] area: 56

Next you have a summary on how can you read some pointer and class operators (*, &, ., ->, [ ]) that appear in the previous example:

expression

can be read as

*x

pointed by x

&x

address of x

x.y

member y of object x

x->y

member y of object pointed by x

(*x).y

member y of object pointed by x (equivalent to the previous one)

x[0]

first object pointed by x

x[1]

second object pointed by x

x[n]

(n+1)th object pointed by x

Be sure that you understand the logic under all of these expressions before proceeding with the next sections. If you have doubts, read again this section and/or consult the previous sections about pointers and data structures.

Classes defined with struct and union

Classes can be defined not only with keyword class, but also with keywords struct and union.

The concepts of class and data structure are so similar that both keywords have in C++ the exact same functionality except that the members of classes declared with keyword struct have public access by default, instead of private access, as classes declared with keyword class have. That is the only difference. For all other purposes both keywords are equivalent.

The concept under unions is different from than of struct and class, since unions only store one data member at a time, but they are also classes and thus can also hold function members. The default access in union classes is public.

 

Classes II

Overloading operators

C++ incorporates the option to use standard operators to perform operations with classes in addition to with fundamental types. For example:

      int a, b, c;
      a = b + c;

This is obviously valid code in C++, since the different variables of the addition are all fundamental types. Nevertheless, it is not so obvious that we could perform an operation similar to the following one:

      struct {
        string product;
        float price;
      } a, b, c;
      a = b + c; 

In fact, this will cause a compilation error, since we have not defined the behavior our class should have with addition operations. However, thanks to the C++ feature to overload operators, we can design classes able to perform operations using standard operators. Here is a list of all the operators that can be overloaded:

Overloadable operators

+    -    *    /    =    <    >    +=   -=   *=   /=   <<   >>
<<=  >>=  ==   !=   <=   >=   ++   --   %    &    ^    !    |
~    &=   ^=   |=   &&   ||   %=   []   ()   ,    ->*  ->   new 
delete    new[]     delete[]

To overload an operator in order to use it with classes we declare operator functions, which are regular functions whose names are the operator keyword followed by the operator sign that we want to overload. The format is:

type operator sign (parameters) { /*...*/ }

Here you have an example that overloads the addition operator (+). We are going to create a class to store bidimensional vectors and then we are going to add two of them: a(3,1) and b(1,2). The addition of two bidimensional vectors is an operation as simple as adding the two x coordinates to obtain the resulting x coordinate and adding the two y coordinates to obtain the resulting y. In this case the result will be (3+1,1+2) = (4,3).

      // vectors: overloading operators example
      #include <iostream>
      using namespace std;
      
      class CVector {
        public:
          int x,y;
          CVector () {};
          CVector (int,int);
          CVector operator + (CVector);
      };
      
      CVector::CVector (int a, int b) {
        x = a;
        y = b;
      }
      
      CVector CVector::operator+ (CVector param) {
        CVector temp;
        temp.x = x + param.x;
        temp.y = y + param.y;
        return (temp);
      }
      
      int main () {
        CVector a (3,1);
        CVector b (1,2);
        CVector c;
        c = a + b;
        cout << c.x << "," << c.y;
        return 0;
      }
Output
      4,3

It may be a little confusing to see so many times the CVector identifier. But, consider that some of them refer to the class name (type) CVector and some others are functions with that name (constructors must have the same name as the class). Do not confuse them:

      CVector (int, int);            // function name CVector
                                          //(constructor)
      CVector operator+ (CVector);   // function returns a CVector

The function operator+ of class CVector is the one that is in charge of overloading the addition operator (+). This function can be called either implicitly using the operator, or explicitly using the function name:

      c = a + b;
      c = a.operator+ (b);

Both expressions are equivalent.

Notice also that we have included the empty constructor (without parameters) and we have defined it with an empty block:

      CVector () { };

This is necessary, since we have explicitly declared another constructor:

      CVector (int, int);

And when we explicitly declare any constructor, with any number of parameters, the default constructor with no parameters that the compiler can declare automatically is not declared, so we need to declare it ourselves in order to be able to construct objects of this type without parameters. Otherwise, the declaration:

      CVector c;

included in main() would not have been valid.

Anyway, I have to warn you that an empty block is a bad implementation for a constructor, since it does not fulfill the minimum functionality that is generally expected from a constructor, which is the initialization of all the member variables in its class. In our case this constructor leaves the variables x and y undefined. Therefore, a more advisable definition would have been something similar to this:

      CVector () { x=0; y=0; };

which in order to simplify and show only the point of the code I have not included in the example.

As well as a class includes a default constructor and a copy constructor even if they are not declared, it also includes a default definition for the assignation operator (=) with the class itself as parameter. The behavior which is defined by default is to copy the whole content of the data members of the object passed as argument (the one at the right side of the sign) to the one at the left side:

      CVector d (2,3);
      CVector e;
      e = d;           // copy assignment operator

The copy assignment operator function is the only operator member function implemented by default. Of course, you can redefine it to any other functionality that you want, like for example, copy only certain class members or perform additional initialization procedures.

The overload of operators does not force its operation to bear a relation to the mathematical or usual meaning of the operator, although it is recommended. For example, the code may not be very intuitive if you use operator + to subtract two classes or operator== to fill with zeros a class, although it is perfectly possible to do so.

Although the prototype of a function operator+ can seem obvious since it takes what is at the right side of the operator as the parameter for the operator member function of the object at its left side, other operators may not be so obvious. Here you have a table with a summary on how the different operator functions have to be declared (replace @ by the operator in each case):

Expression

Operator

Member function

Global function

@a

+ - * & ! ~ ++ --

A::operator@()

operator@(A)

a@

++ --

A::operator@(int)

operator@(A,int)

a@b

+ - * / % ^ & | < > == != <= >= << >> && || ,

A::operator@ (B)

operator@(A,B)

a@b

= += -= *= /= %= ^= &= |= <<= >>= []

A::operator@ (B)

-

a(b, c...)

()

A::operator() (B, C...)

-

a->x

->

A::operator->()

-

Where a is an object of class A, b is an object of class B and c is an object of class C.

You can see in this panel that there are two ways to overload some class operators: as a member function and as a global function. Its use is indistinct, nevertheless I remind you that functions that are not members of a class cannot access the private or protected members of that class unless the global function is its friend (friendship is explained later).

The keyword this

The keyword this represents a pointer to the object whose member function is being executed. It is a pointer to the object itself.

One of its uses can to check if a parameter passed to a member function is the object itself. For example,

      // this
      #include <iostream>
      using namespace std;
      
      class CDummy {
        public:
          int isitme (CDummy& param);
      };
      
      int CDummy::isitme (CDummy& param)
      {
        if (&param == this) return true;
        else return false;
      }
      
      int main () {
        CDummy a;
        CDummy* b = &a;
        if ( b->isitme(a) )
          cout << "yes, &a is b";
        return 0;
      }
Output
      yes, &a is b

It is also frequently used in operator= member functions that return objects by reference (avoiding the use of temporary objects). Following with the vector's examples seen before we could have written an operator= function similar to this one:

      CVector& CVector::operator= (const CVector& param)
      {
        x=param.x;
        y=param.y;
        return *this;
      }

In fact this function is very similar to the code that the compiler generates implicitly for this class if we do not include an operator= member function to copy objects of this class.

Static members

A class can contain static members, either data or functions.

Static data members of a class are also known as "class variables", because there is only one unique value for all the objects of that same class. Their content is not different from one object of this class to another.

For example, it may be used for a variable within a class that can contain a counter with the number of objects of that class that have been created, as in the following example:

      // static members in classes
      #include <iostream>
      using namespace std;
      
      class CDummy {
        public:
          static int n;
          CDummy () { n++; };
          ~CDummy () { n--; };
      };
      
      int CDummy::n=0;
      
      int main () {
        CDummy a;
        CDummy b[5];
        CDummy * c = new CDummy;
        cout << a.n << endl;
        delete c;
        cout << CDummy::n << endl;
        return 0;
      }
Output
      7
      6

In fact, static members have the same properties as global variables but they enjoy class scope. For that reason, and to avoid them to be declared several times, we can only include the prototype (its declaration) in the class declaration but not its definition (its initialization). In order to initialize a static data-member we must include a formal definition outside the class, in the global scope, as in the previous example:

      int CDummy::n=0;

Because it is a unique variable value for all the objects of the same class, it can be referred to as a member of any object of that class or even directly by the class name (of course this is only valid for static members):

      cout << a.n;
      cout << CDummy::n; 

These two calls included in the previous example are referring to the same variable: the static variable n within class CDummy shared by all objects of this class.

Once again, I remind you that in fact it is a global variable. The only difference is its name and possible access restrictions outside its class.

Just as we may include static data within a class, we can also include static functions. They represent the same: they are global functions that are called as if they were object members of a given class. They can only refer to static data, in no case to non-static members of the class, as well as they do not allow the use of the keyword this, since it makes reference to an object pointer and these functions in fact are not members of any object but direct members of the class.

 

Friendship and inheritance

Friend functions

In principle, private and protected members of a class cannot be accessed from outside the same class in which they are declared. However, this rule does not affect friends.

Friends are functions or classes declared as such.

If we want to declare an external function as friend of a class, thus allowing this function to have access to the private and protected members of this class, we do it by declaring a prototype of this external function within the class, and preceding it with the keyword friend:

      // friend functions
      #include <iostream>
      using namespace std;
      
      class CRectangle {
          int width, height;
        public:
          void set_values (int, int);
          int area () {return (width * height);}
          friend CRectangle duplicate (CRectangle);
      };
      
      void CRectangle::set_values (int a, int b) {
        width = a;
        height = b;
      }
      
      CRectangle duplicate (CRectangle rectparam)
      {
        CRectangle rectres;
        rectres.width = rectparam.width*2;
        rectres.height = rectparam.height*2;
        return (rectres);
      }
      
      int main () {
        CRectangle rect, rectb;
        rect.set_values (2,3);
        rectb = duplicate (rect);
        cout << rectb.area();
        return 0;
      }
Output
      24 

The duplicate function is a friend of CRectangle. From within that function we have been able to access the members width and height of different objects of type CRectangle, which are private members. Notice that neither in the declaration of duplicate() nor in its later use in main() have we considered duplicate a member of class CRectangle. It isn't! It simply has access to its private and protected members without being a member.

The friend functions can serve, for example, to conduct operations between two different classes. Generally, the use of friend functions is out of an object-oriented programming methodology, so whenever possible it is better to use members of the same class to perform operations with them. Such as in the previous example, it would have been shorter to integrate duplicate() within the class CRectangle.

Friend classes

Just as we have the possibility to define a friend function, we can also define a class as friend of another one, granting that second class access to the protected and private members of the first one.

      // friend class
      #include <iostream>
      using namespace std;
      
      class CSquare;
      
      class CRectangle {
          int width, height;
        public:
          int area ()
            {return (width * height);}
          void convert (CSquare a);
      };
      
      class CSquare {
        private:
          int side;
        public:
          void set_side (int a)
            {side=a;}
          friend class CRectangle;
      };
      
      void CRectangle::convert (CSquare a) {
        width = a.side;
        height = a.side;
      }
        
      int main () {
        CSquare sqr;
        CRectangle rect;
        sqr.set_side(4);
        rect.convert(sqr);
        cout << rect.area();
        return 0;
      }
Output
      16

In this example, we have declared CRectangle as a friend of CSquare so that CRectangle member functions could have access to the protected and private members of CSquare, more concretely to CSquare::side, which describes the side width of the square.

You may also see something new at the beginning of the program: an empty declaration of class CSquare. This is necessary because within the declaration of CRectangle we refer to CSquare (as a parameter in convert()). The definition of CSquare is included later, so if we did not include a previous empty declaration for CSquare this class would not be visible from within the definition of CRectangle.

Consider that friendships are not corresponded if we do not explicitly specify so. In our example, CRectangle is considered as a friend class by CSquare, but CRectangle does not consider CSquare to be a friend, so CRectangle can access the protected and private members of CSquare but not the reverse way. Of course, we could have declared also CSquare as friend of CRectangle if we wanted to.

Another property of friendships is that they are not transitive: The friend of a friend is not considered to be a friend unless explicitly specified.

Inheritance between classes

A key feature of C++ classes is inheritance. Inheritance allows to create classes which are derived from other classes, so that they automatically include some of its "parent's" members, plus its own. For example, we are going to suppose that we want to declare a series of classes that describe polygons like our CRectangle, or like CTriangle. They have certain common properties, such as both can be described by means of only two sides: height and base.

This could be represented in the world of classes with a class CPolygon from which we would derive the two other ones: CRectangle and CTriangle.

The class CPolygon would contain members that are common for both types of polygon. In our case: width and height. And CRectangle and CTriangle would be its derived classes, with specific features that are different from one type of polygon to the other.

Classes that are derived from others inherit all the accessible members of the base class. That means that if a base class includes a member A and we derive it to another class with another member called B, the derived class will contain both members A and B.

In order to derive a class from another, we use a colon (:) in the declaration of the derived class using the following format:

class derived_class_name: public base_class_name
{ /*...*/ };

Where derived_class_name is the name of the derived class and base_class_name is the name of the class on which it is based. The public access specifier may be replaced by any one of the other access specifiers protected and private. This access specifier describes the minimum access level for the members that are inherited from the base class.

      // derived classes
      #include <iostream>
      using namespace std;
      
      class CPolygon {
        protected:
          int width, height;
        public:
          void set_values (int a, int b)
            { width=a; height=b;}
        };
      
      class CRectangle: public CPolygon {
        public:
          int area ()
            { return (width * height); }
        };
      
      class CTriangle: public CPolygon {
        public:
          int area ()
            { return (width * height / 2); }
        };
        
      int main () {
        CRectangle rect;
        CTriangle trgl;
        rect.set_values (4,5);
        trgl.set_values (4,5);
        cout << rect.area() << endl;
        cout << trgl.area() << endl;
        return 0;
      }
Output
      20
      10

The objects of the classes CRectangle and CTriangle each contain members inherited from CPolygon. These are: width, height and set_values().

The protected access specifier is similar to private. Its only difference occurs in fact with inheritance. When a class inherits from another one, the members of the derived class can access the protected members inherited from the base class, but not its private members.

Since we wanted width and height to be accessible from members of the derived classes CRectangle and CTriangle and not only by members of CPolygon, we have used protected access instead of private.

We can summarize the different access types according to who can access them in the following way:

Access

public

protected

private

members of the same class

yes

yes

yes

members of derived classes

yes

yes

no

not members

yes

no

no

Where "not members" represent any access from outside the class, such as from main(), from another class or from a function.

In our example, the members inherited by CRectangle and CTriangle have the same access permissions as they had in their base class CPolygon:

      CPolygon::width           // protected access
      CRectangle::width         // protected access
      
      CPolygon::set_values()    // public access
      CRectangle::set_values()  // public access

This is because we have used the public keyword to define the inheritance relationship on each of the derived classes:

      class CRectangle: public CPolygon { ... }

This public keyword after the colon (:) denotes the minimum access level for all the members inherited from the class that follows it (in this case CPolygon). Since public is the most accessible level, by specifying this keyword the derived class will inherit all the members with the same levels they had in the base class.

If we specify a more restrictive access level like protected, all public members of the base class are inherited as protected in the derived class. Whereas if we specify the most restricting of all access levels: private, all the base class members are inherited as private.

For example, if daughter was a class derived from mother that we defined as:

      class daughter: protected mother;

This would set protected as the maximum access level for the members of daughter that it inherited from mother. That is, all members that were public in mother would become protected in daughter. Of course, this would not restrict daughter to declare its own public members. That maximum access level is only set for the members inherited from mother.

If we do not explicitly specify any access level for the inheritance, the compiler assumes private for classes declared with class keyword and public for those declared with struct.

What is inherited from the base class?

In principle, a derived class inherits every member of a base class except:

Although the constructors and destructors of the base class are not inherited themselves, its default constructor (i.e., its constructor with no parameters) and its destructor are always called when a new object of a derived class is created or destroyed.

If the base class has no default constructor or you want that an overloaded constructor is called when a new derived object is created, you can specify it in each constructor definition of the derived class:

derived_constructor_name (parameters) : base_constructor_name (parameters) {...}

For example:

      // constructors and derived classes
      #include <iostream>
      using namespace std;
      
      class mother {
        public:
          mother ()
            { cout << "mother: no parameters\n"; }
          mother (int a)
            { cout << "mother: int parameter\n"; }
      };
      
      class daughter : public mother {
        public:
          daughter (int a)
            { cout << "daughter: int parameter\n\n"; }
      };
      
      class son : public mother {
        public:
          son (int a) : mother (a)
            { cout << "son: int parameter\n\n"; }
      };
      
      int main () {
        daughter cynthia (0);
        son daniel(0);
        
        return 0;
      }
Output
      mother: no parameters
      daughter: int parameter
       
      mother: int parameter
      son: int parameter

Notice the difference between which mother's constructor is called when a new daughter object is created and which when it is a son object. The difference is because the constructor declaration of daughter and son:

      daughter (int a)          // nothing specified: call default
      son (int a) : mother (a)  // constructor specified: call this

Multiple inheritance

In C++ it is perfectly possible that a class inherits members from more than one class. This is done by simply separating the different base classes with commas in the derived class declaration. For example, if we had a specific class to print on screen (COutput) and we wanted our classes CRectangle and CTriangle to also inherit its members in addition to those of CPolygon we could write:

      class CRectangle: public CPolygon, public COutput;
      class CTriangle: public CPolygon, public COutput; 

here is the complete example:

      // multiple inheritance
      #include <iostream>
      using namespace std;
      
      class CPolygon {
        protected:
          int width, height;
        public:
          void set_values (int a, int b)
            { width=a; height=b;}
        };
      
      class COutput {
        public:
          void output (int i);
        };
      
      void COutput::output (int i) {
        cout << i << endl;
        }
      
      class CRectangle: public CPolygon, public COutput {
        public:
          int area ()
            { return (width * height); }
        };
      
      class CTriangle: public CPolygon, public COutput {
        public:
          int area ()
            { return (width * height / 2); }
        };
        
      int main () {
        CRectangle rect;
        CTriangle trgl;
        rect.set_values (4,5);
        trgl.set_values (4,5);
        rect.output (rect.area());
        trgl.output (trgl.area());
        return 0;
      }
Output
      20
      10  

 

Polymorphism

Before getting into this section, it is recommended that you have a proper understanding of pointers and class inheritance. If any of the following statements seem strange to you, you should review the indicated sections:

Statement:

Explained in:

int a::b(c) {};

Classes

a->b

Pointers

class a: public b;

Friendship and inheritance

Pointers to base class

One of the key features of derived classes is that a pointer to a derived class is type-compatible with a pointer to its base class. Polymorphism is the art of taking advantage of this simple but powerful and versatile feature, that brings Object Oriented Methodologies to its full potential.

We are going to start by rewriting our program about the rectangle and the triangle of the previous section taking into consideration this pointer compatibility property:

      // pointers to base class
      #include <iostream>
      using namespace std;
      
      class CPolygon {
        protected:
          int width, height;
        public:
          void set_values (int a, int b)
            { width=a; height=b; }
        };
      
      class CRectangle: public CPolygon {
        public:
          int area ()
            { return (width * height); }
        };
      
      class CTriangle: public CPolygon {
        public:
          int area ()
            { return (width * height / 2); }
        };
      
      int main () {
        CRectangle rect;
        CTriangle trgl;
        CPolygon * ppoly1 = &rect;
        CPolygon * ppoly2 = &trgl;
        ppoly1->set_values (4,5);
        ppoly2->set_values (4,5);
        cout << rect.area() << endl;
        cout << trgl.area() << endl;
        return 0;
      }
Output
      20
      10

In function main, we create two pointers that point to objects of class CPolygon (ppoly1 and ppoly2). Then we assign references to rect and trgl to these pointers, and because both are objects of classes derived from CPolygon, both are valid assignations.

The only limitation in using *ppoly1 and *ppoly2 instead of rect and trgl is that both *ppoly1 and *ppoly2 are of type CPolygon* and therefore we can only use these pointers to refer to the members that CRectangle and CTriangle inherit from CPolygon. For that reason when we call the area() members at the end of the program we have had to use directly the objects rect and trgl instead of the pointers *ppoly1 and *ppoly2.

In order to use area() with the pointers to class CPolygon, this member should also have been declared in the class CPolygon, and not only in its derived classes, but the problem is that CRectangle and CTriangle implement different versions of area, therefore we cannot implement it in the base class. This is when virtual members become handy:

Virtual members

A member of a class that can be redefined in its derived classes is known as a virtual member. In order to declare a member of a class as virtual, we must precede its declaration with the keyword virtual:

      // virtual members
      #include <iostream>
      using namespace std;
      
      class CPolygon {
        protected:
          int width, height;
        public:
          void set_values (int a, int b)
            { width=a; height=b; }
          virtual int area ()
            { return (0); }
        };
      
      class CRectangle: public CPolygon {
        public:
          int area ()
            { return (width * height); }
        };
      
      class CTriangle: public CPolygon {
        public:
          int area ()
            { return (width * height / 2); }
        };
      
      int main () {
        CRectangle rect;
        CTriangle trgl;
        CPolygon poly;
        CPolygon * ppoly1 = &rect;
        CPolygon * ppoly2 = &trgl;
        CPolygon * ppoly3 = &poly;
        ppoly1->set_values (4,5);
        ppoly2->set_values (4,5);
        ppoly3->set_values (4,5);
        cout << ppoly1->area() << endl;
        cout << ppoly2->area() << endl;
        cout << ppoly3->area() << endl;
        return 0;
      }
Output
      20
      10
      0

Now the three classes (CPolygon, CRectangle and CTriangle) have all the same members: width, height, set_values() and area().

The member function area() has been declared as virtual in the base class because it is later redefined in each derived class. You can verify if you want that if you remove this virtual keyword from the declaration of area() within CPolygon, and then you run the program the result will be 0 for the three polygons instead of 20, 10 and 0. That is because instead of calling the corresponding area() function for each object (CRectangle::area(), CTriangle::area() and CPolygon::area(), respectively), CPolygon::area() will be called in all cases since the calls are via a pointer whose type is CPolygon*.

Therefore, what the virtual keyword does is to allow a member of a derived class with the same name as one in the base class to be appropriately called from a pointer, and more precisely when the type of the pointer is a pointer to the base class but is pointing to an object of the derived class, as in the above example.

A class that declares or inherits a virtual function is called a polymorphic class.

Note that despite of its virtuality, we have also been able to declare an object of type CPolygon and to call its own area() function, which always returns 0.

Abstract base classes

Abstract base classes are something very similar to our CPolygon class of our previous example. The only difference is that in our previous example we have defined a valid area() function with a minimal functionality for objects that were of class CPolygon (like the object poly), whereas in an abstract base classes we could leave that area() member function without implementation at all. This is done by appending =0 (equal to zero) to the function declaration.

An abstract base CPolygon class could look like this:

      // abstract class CPolygon
      class CPolygon {
        protected:
          int width, height;
        public:
          void set_values (int a, int b)
            { width=a; height=b; }
          virtual int area () =0;
      };

Notice how we appended =0 to virtual int area () instead of specifying an implementation for the function. This type of function is called a pure virtual function, and all classes that contain at least one pure virtual function are abstract base classes.

The main difference between an abstract base class and a regular polymorphic class is that because in abstract base classes at least one of its members lacks implementation we cannot create instances (objects) of it.

But a class that cannot instantiate objects is not totally useless; We can create pointers to it and take advantage of all its polymorphic abilities. Therefore a declaration like:

      CPolygon poly;

would not be valid for the abstract base class we have just declared, because tries to instantiate an object. Nevertheless, the following pointers:

      CPolygon * ppoly1;
      CPolygon * ppoly2;

would be perfectly valid.

This is so for as long as CPolygon includes a pure virtual function and therefore it's an abstract base class. However, pointers to this abstract base class can be used to point to objects of derived classes.

Here you have the complete example:

      // abstract base class
      #include <iostream>
      using namespace std;
      
      class CPolygon {
        protected:
          int width, height;
        public:
          void set_values (int a, int b)
            { width=a; height=b; }
          virtual int area (void) =0;
        };
      
      class CRectangle: public CPolygon {
        public:
          int area (void)
            { return (width * height); }
        };
      
      class CTriangle: public CPolygon {
        public:
          int area (void)
            { return (width * height / 2); }
        };
      
      int main () {
        CRectangle rect;
        CTriangle trgl;
        CPolygon * ppoly1 = &rect;
        CPolygon * ppoly2 = &trgl;
        ppoly1->set_values (4,5);
        ppoly2->set_values (4,5);
        cout << ppoly1->area() << endl;
        cout << ppoly2->area() << endl;
        return 0;
      }
Output
      20
      10

If you review the program you will notice that we refer to objects of different but related classes using a unique type of pointer (CPolygon*). This can be tremendously useful. For example, now we can create a function member of the abstract base class CPolygon that is able to print on screen the result of the area() function even though CPolygon itself has no implementation for this function:

      // pure virtual members can be called
      // from the abstract base class
      #include <iostream>
      using namespace std;
      
      class CPolygon {
        protected:
          int width, height;
        public:
          void set_values (int a, int b)
            { width=a; height=b; }
          virtual int area (void) =0;
          void printarea (void)
            { cout << this->area() << endl; }
        };
      
      class CRectangle: public CPolygon {
        public:
          int area (void)
            { return (width * height); }
        };
      
      class CTriangle: public CPolygon {
        public:
          int area (void)
            { return (width * height / 2); }
        };
      
      int main () {
        CRectangle rect;
        CTriangle trgl;
        CPolygon * ppoly1 = &rect;
        CPolygon * ppoly2 = &trgl;
        ppoly1->set_values (4,5);
        ppoly2->set_values (4,5);
        ppoly1->printarea();
        ppoly2->printarea();
        return 0;
      }
Output
      20
      10

Virtual members and abstract classes grant C++ the polymorphic characteristics that make object-oriented programming such a useful instrument in big projects. Of course, we have seen very simple uses of these features, but these features can be applied to arrays of objects or dynamically allocated objects.

Let's end with the same example again, but this time with objects that are dynamically allocated:

      // dynamic allocation and polymorphism
      #include <iostream>
      using namespace std;
      
      class CPolygon {
        protected:
          int width, height;
        public:
          void set_values (int a, int b)
            { width=a; height=b; }
          virtual int area (void) =0;
          void printarea (void)
            { cout << this->area() << endl; }
        };
      
      class CRectangle: public CPolygon {
        public:
          int area (void)
            { return (width * height); }
        };
      
      class CTriangle: public CPolygon {
        public:
          int area (void)
            { return (width * height / 2); }
        };
      
      int main () {
        CPolygon * ppoly1 = new CRectangle;
        CPolygon * ppoly2 = new CTriangle;
        ppoly1->set_values (4,5);
        ppoly2->set_values (4,5);
        ppoly1->printarea();
        ppoly2->printarea();
        delete ppoly1;
        delete ppoly2;
        return 0;
      }
Output
      20
      10

Notice that the ppoly pointers:

      CPolygon * ppoly1 = new CRectangle;
      CPolygon * ppoly2 = new CTriangle;

are declared being of type pointer to CPolygon but the objects dynamically allocated have been declared having the derived class type directly.

 

Advanced Concepts

 

Templates

Function templates

Function templates are special functions that can operate with generic types. This allows us to create a function template whose functionality can be adapted to more than one variable type or class without repeating the code for each type.

This is achieved through template parameters. A template parameter is a special kind of parameter that can be used to pass a type as parameter. These function templates can use these parameters as if they were regular types.

The format for declaring function templates with type parameters is:

template <class identifier> function_declaration;
template <typename identifier> function_declaration;

The only difference between both prototypes is the use of either the keyword class or the keyword typename. Its use is indistinct since both expressions have exactly the same meaning and behave exactly the same way.

For example, to create a template function that returns the greater one of two objects we could use:

      template <class myType>
      myType GetMax (myType a, myType b) {
       return (a>b?a:b);
      }

Here we have created a template function with myType as its template parameter. This template parameter represents a type that has not yet been specified, but that can be used in the template function as if it were a regular type. As you can see, the function template GetMax returns the greater of two parameters of this still-undefined type.

To use this function template we use the following format for the function call:

function_name <type> (parameters);

For example, to call GetMax to compare two integer values of type int we can write:

      int x,y;
      GetMax <int> (x,y);

When the compiler encounters this call to a template function, it uses the template to automatically generate a function replacing each appearance of myType by the type passed as the actual template parameter (int in this case) and then calls it. This process is automatically performed by the compiler and is invisible to the programmer.

Here is the entire example:

      // function template
      #include <iostream>
      using namespace std;
      
      template <class T>
      T GetMax (T a, T b) {
        T result;
        result = (a>b)? a : b;
        return (result);
      }
      
      int main () {
        int i=5, j=6, k;
        long l=10, m=5, n;
        k=GetMax<int>(i,j);
        n=GetMax<long>(l,m);
        cout << k << endl;
        cout << n << endl;
        return 0;
      }
Output
      6
      10

In this case we have used T as the template parameter name instead of myType because it is shorter and in fact is a very common template parameter name. But you can use any identifier you like.

In the example above we used the function template GetMax() twice. The first time with arguments of type int and the second one with arguments of type long. The compiler has instantiated and then called each time the appropriate version of the function.

As you can see, the type T is used within the GetMax() template function even to declare new objects of that type:

      T result;

Therefore, result will be an object of the same type as the parameters a and b when the function template is instantiated with a specific type.

In this specific case where the generic type T is used as a parameter for GetMax the compiler can find out automatically which data type has to instantiate without having to explicitly specify it within angle brackets (like we have done before specifying <int> and <long>). So we could have written instead:

      int i,j;
      GetMax (i,j);

Since both i and j are of type int, and the compiler can automatically find out that the template parameter can only be int. This implicit method produces exactly the same result:

      // function template II
      #include <iostream>
      using namespace std;
      
      template <class T>
      T GetMax (T a, T b) {
        return (a>b?a:b);
      }
      
      int main () {
        int i=5, j=6, k;
        long l=10, m=5, n;
        k=GetMax(i,j);
        n=GetMax(l,m);
        cout << k << endl;
        cout << n << endl;
        return 0;
      }
Output
      6
      10

Notice how in this case, we called our function template GetMax() without explicitly specifying the type between angle-brackets <>. The compiler automatically determines what type is needed on each call.

Because our template function includes only one template parameter (class T) and the function template itself accepts two parameters, both of this T type, we cannot call our function template with two objects of different types as arguments:

      int i;
      long l;
      k = GetMax (i,l); 

This would not be correct, since our GetMax function template expects two arguments of the same type, and in this call to it we use objects of two different types.

We can also define function templates that accept more than one type parameter, simply by specifying more template parameters. For example:

      template <class T, class U>
      T GetMin (T a, U b) {
        return (a<b?a:b);
      }

In this case, our function template GetMin() accepts two parameters of different types and returns an object of the same type as the first parameter (T) that is passed. For example, after that declaration we could call GetMin() with:

      int i,j;
      long l;
      i = GetMin<int,long> (j,l);

or simply:

      i = GetMin (j,l);

even though j and l have different types. But the compiler can determine the apropriate instantiation anyway.

Class templates

We also have the possibility to write class templates, so that a class can have members that use template parameters as types. For example:

      template <class T>
      class pair {
          T values [2];
        public:
          pair (T first, T second)
          {
            values[0]=first; values[1]=second;
          }
      };

The class that we have just defined serves to store two elements of any valid type. For example, if we wanted to declare an object of this class to store two integer values of type int with the values 115 and 36 we would write:

      pair<int> myobject (115, 36); 

this same class would also be used to create an object to store any other type:

      pair<float> myfloats (3.0, 2.18); 

The only member function in the previous class template has been defined inline within the class declaration itself. In case that we define a function member outside the declaration of the class template, we must always precede that definition with the template <...> prefix:

      // class templates
      #include <iostream>
      using namespace std;
      
      template <class T>
      class pair {
          T a, b;
        public:
          pair (T first, T second)
            {a=first; b=second;}
          T getmax ();
      };
      
      template <class T>
      T pair<T>::getmax ()
      {
        T retval;
        retval = a>b? a : b;
        return retval;
      }
      
      int main () {
        pair <int> myobject (100, 75);
        cout << myobject.getmax();
        return 0;
      }
Output
      100

Notice the syntax of the definition of member function getmax:

      template <class T>
      T pair<T>::getmax () 

Confused by so many T's? There are three T's in this declaration. The first one is the template parameter. The second T refers to the type returned by the function. And the third T (the one between angle brackets) is also a requirement. It specifies that this function's template parameter is also the class template parameter.

Template specialization

If we want to define a different implementation for a template when a specific type is passed as template parameter, we can declare a specialization of that template.

For example, let's suppose that we have a very simple class called container that can store one element of any type and that it has just one member function called increase, that increases its value. But we find that when it stores an element of type char it would be more convenient to have a completely different implementation with a function member uppercase, so we decide to declare a class template specialization for that type:

      // template specialization
      #include <iostream>
      using namespace std;
      
      template <class T>
      class container {
          T element;
        public:
          container (T arg) {element=arg;}
          T increase () {return ++element;}
      };
      
      template <>
      class container <char> {
          char element;
        public:
          container (T arg) {element=arg;}
          char uppercase ();
      };
      
      template <>
      char container<char>::uppercase()
      {
        if ((element>='a')&&(element<='z'))
        element+='A'-'a';
        return element;
      }
      
      int main () {
        container<int> myint (7);
        container<char> mychar ('j');
        cout << myint.increase() << endl;
        cout << mychar.uppercase() << endl;
        return 0;
      }
Output
      8
      J

This is the syntax used in the class template specialization:

      template <> class container <char> { ... };

First of all, notice that we precede the class template name with template<>. This is a requirement in standard C++ for all template specializations, although some compilers, as an anachronism, support to declare template specialization without this prefix.

But more important than this prefix, is the <char> specialization parameter after the class template name. This specialization parameter itself identifies the type for which we are going to declare a template class specialization (char). Notice the differences between the generic class template and the specialization:

      template <class T> class container { ... };
      template <> class container <char> { ... };

The first line is the generic template, and the second one is the specialization.

When we declare specializations for a template class, we must also define all its members, because there is no inheritance of members from the generic template to the specialization.

Non-type parameters for templates

Besides the template arguments that are preceded by the class or typename keywords , which represent types, templates can also have regular typed parameters, similar to those found in functions. As an example, have a look at this class template that is used to contain sequences of elements:

      // sequence template
      #include <iostream>
      using namespace std;
      
      template <class T, int N>
      class sequence {
          T memblock [N];
        public:
          void setmember (int x, T value);
          T getmember (int x);
      };
      
      template <class T, int N>
      void sequence<T,N>::setmember (int x, T value) {
        memblock[x]=value;
      }
      
      template <class T, int N>
      T sequence<T,N>::getmember (int x) {
        return memblock[x];
      }
      
      int main () {
        sequence <int,5> myints;
        sequence <double,5> myfloats;
        myints.setmember (0,100);
        myfloats.setmember (3,3.1416);
        cout << myints.getmember(0) << '\n';
        cout << myfloats.getmember(3) << '\n';
        return 0;
      }
Output
      100
      3.1416 

It is also possible to set default values or types for class template parameters. For example, if the previous class template definition had been:

      template <class T=char, int N=10> class sequence {..};

We could create objects using the default template parameters by declaring:

      sequence<> myseq;

Which would be equivalent to:

      sequence<char,10> myseq;

Templates and multiple-file projects

From the point of view of the compiler, templates are not normal functions or classes. They are compiled on demand, meaning that the code of a template function is not compiled until an instantiation with specific template arguments is required. At that moment, when an instantiation is required, the compiler generates a function specifically for those arguments from the template.

When projects grow it is usual to split the code of a program in different source code files. In these cases, the interface and implementation are generally separated. Taking a library of functions as example, the interface generally consists of declarations of the prototypes of all the functions that can be called. These are generally declared in a "header file" with a .h extension, and the implementation (the definition of these functions) is in an independent file with c++ code.

Because templates are compiled when required, this forces a restriction for multi-file projects: the implementation (definition) of a template class or function must be in the same file as its declaration. That means that we cannot separate the interface in a separate header file, and that we must include both interface and implementation in any file that uses the templates.

Since no code is generated until a template is instantiated when required, compilers are prepared to allow the inclusion more than once of the same template file with both declarations and definitions in a project without generating linkage errors.

 

Namespaces

Namespaces allow to group entities like classes, objects and functions under a name. This way the global scope can be divided in "sub-scopes", each one with its own name.

The format of namespaces is:

namespace identifier
{
entities
}

Where identifier is any valid identifier and entities is the set of classes, objects and functions that are included within the namespace. For example:

      namespace myNamespace
      {
        int a, b;
      }

In this case, the variables a and b are normal variables declared within a namespace called myNamespace. In order to access these variables from outside the myNamespace namespace we have to use the scope operator ::. For example, to access the previous variables from outside [myNamespace=] we can write:

      general::a
      general::b 

The functionality of namespaces is especially useful in the case that there is a possibility that a global object or function uses the same identifier as another one, causing redefinition errors. For example:

      // namespaces
      #include <iostream>
      using namespace std;
      
      namespace first
      {
        int var = 5;
      }
      
      namespace second
      {
        double var = 3.1416;
      }
      
      int main () {
        cout << first::var << endl;
        cout << second::var << endl;
        return 0;
      }
Output
      5
      3.1416

In this case, there are two global variables with the same name: var. One is defined within the namespace first and the other one in second. No redefinition errors happen thanks to namespaces.

using

The keyword using is used to introduce a name from a namespace into the current declarative region. For example:

      // using
      #include <iostream>
      using namespace std;
      
      namespace first
      {
        int x = 5;
        int y = 10;
      }
      
      namespace second
      {
        double x = 3.1416;
        double y = 2.7183;
      }
      
      int main () {
        using first::x;
        using second::y;
        cout << x << endl;
        cout << y << endl;
        cout << first::y << endl;
        cout << second::x << endl;
        return 0;
      }
Output
      5
      2.7183
      10
      3.1416

Notice how in this code, x (without any name qualifier) refers to first::x whereas y refers to second::y, exactly as our using declarations have specified. We still have access to first::y and second::x using their fully qualified names.

The keyword using can also be used as a directive to introduce an entire namespace:

      // using
      #include <iostream>
      using namespace std;
      
      namespace first
      {
        int x = 5;
        int y = 10;
      }
      
      namespace second
      {
        double x = 3.1416;
        double y = 2.7183;
      }
      
      int main () {
        using namespace first;
        cout << x << endl;
        cout << y << endl;
        cout << second::x << endl;
        cout << second::y << endl;
        return 0;
      }
Output
      5
      10
      3.1416
      2.7183

In this case, since we have declared that we were using namespace first, all direct uses of x and y without name qualifiers were referring to their declarations in namespace first.

using and using namespace have validity only in the same block in which they are stated or in the entire code if they are used directly in the global scope. For example, if we had the intention to first use the objects of one namespace and then those of another one, we could do something like:

      // using namespace example
      #include <iostream>
      using namespace std;
      
      namespace first
      {
        int x = 5;
      }
      
      namespace second
      {
        double x = 3.1416;
      }
      
      int main () {
        {
          using namespace first;
          cout << x << endl;
        }
        {
          using namespace second;
          cout << x << endl;
        }
        return 0;
      }
Output
      5
      3.1416

Namespace alias

We can declare alternate names for existing namespaces according to the following format:

namespace new_name = current_name;

Namespace std

All the files in the C++ standard library declare all of its entities within the std namespace. That is why we have generally included the using namespace std; statement in all programs that used any entity defined in iostream.

 

Exceptions

Exceptions provide a way to react to exceptional circumstances (like runtime errors) in our program by transferring control to special functions called handlers.

To catch exceptions we must place a portion of code under exception inspection. This is done by enclosing that portion of code in a try block. When an exceptional circumstance arises within that block, an exception is thrown that transfers the control to the exception handler. If no exception is thrown, the code continues normally and all handlers are ignored.

A exception is thrown by using the throw keyword from inside the try block. Exception handlers are declared with the keyword catch, which must be placed immediately after the try block:

      // exceptions
      #include <iostream>
      using namespace std;
      
      int main () {
        try
        {
          throw 20;
        }
        catch (int e)
        {
          cout << "An exception occurred. Exception Nr. " << e << endl;
        }
        return 0;
      }
Output
      An exception occurred. Exception Nr. 20

The code under exception handling is enclosed in a try block. In this example this code simply throws an exception:

      throw 20;

A throw expression accepts one parameter (in this case the integer value 20), which is passed as an argument to the exception handler.

The exception handler is declared with the catch keyword. As you can see, it follows immediately the closing brace of the try block. The catch format is similar to a regular function that always has at least one parameter. The type of this parameter is very important, since the type of the argument passed by the throw expression is checked against it, and only in the case they match, the exception is catch.

We can chain multiple handlers (catch expressions), each one with a different parameter type. Only the handler that matches its type with the argument specified in the throw statement is executed.

If we use an ellipsis (...) as the parameter of catch, that handler will catch any exception no matter what the type of the throw exception is. This can be used as a default handler that catches all exceptions not caught by other handlers if it is specified at last:

      try {
        // code here
      }
      catch (int param) { cout << "int exception"; }
      catch (char param) { cout << "char exception"; }
      catch (...) { cout << "default exception"; }

In this case the last handler would catch any exception thrown with any parameter that is neither an int nor a char.

After an exception has been handled the program execution resumes after the try-catch block, not after the throw statement!.

It is also possible to nest try-catch blocks within more external try blocks. In these cases, we have the possibility that an internal catch block forwards the exception to its external level. This is done with the expression throw; with no arguments. For example:

      try {
        try {
            // code here
        }
        catch (int n) {
            throw;
        }
      }
      catch (...) {
        cout << "Exception occurred";
      }

Standard exceptions

The C++ Standard library provides a base class specifically designed to declare objects to be thrown as exceptions. It is called exception and is defined in the <exception> header file under the namespace std. This class has the usual default and copy constructors, operators and destructors, plus an additional virtual member function called what that returns a null-terminated character sequence (char *) and that can be redefined in derived class to contain some sort of description of the exception.

      // standard exceptions
      #include <iostream>
      #include <exception>
      using namespace std;
      
      class myexception: public exception
      {
        virtual const char* what() const
        {
          return "My exception happened";
        }
      } myex;
      
      int main () {
        try
        {
          throw myex;
        }
        catch (exception& e)
        {
          cout << e.what() << endl;
        }
        return 0;
      }
Output
      My exception happened.

We have placed a handler that catches exception objects by reference (notice the ampersand & after the type), therefore this catches also classes derived from exception, like our myex object of class myexception.

All exceptions thrown by components of the C++ Standard library throw exceptions derived from this std::exception class. These are:

exception

description

bad_alloc

thrown by new on allocation failure

bad_cast

thrown by dynamic_cast when fails with a referenced type

bad_exception

thrown when an exception type doesn't match any catch

bad_typeid

thrown by typeid

ios_base::failure

thrown by functions in the iostream library

For example, if we use the operator new without (nothrow) and the memory cannot be allocated, an exception of type bad_alloc is thrown:

      try
      {
        int * myarray= new int[1000];
      }
      catch (bad_alloc&)
      {
        cout << "Error allocating memory." << endl;
      }

It is recommended to include all dynamic memory allocations within a try block that catches this type of exception to perform a clean action instead of an abnormal program termination, which is what happens when this type of exception is thrown and not caught. You can force a bad allocation exception by trying to allocate a huge array; On my system, trying to allocate 1 billion ints threw a bad_alloc exception.

Because bad_alloc is derived from the standard base class exception, we can handle that same exception by catching references to the exception class:

      // bad_alloc standard exception
      #include <iostream>
      #include <exception>
      using namespace std;
      
      int main () {
        try
        {
          int* myarray= new int[1000];
        }
        catch (exception& e)
        {
          cout << "Standard exception: " << e.what() << endl;
        }
        return 0;
      }

Exception specifications

When declaring a function we can limit the exception type it might directly or indirectly throw by appending a throw suffix to the function declaration:

      float myfunction (char param) throw (int);

This declares a function called myfunction which takes one agument of type char and returns an element of type float. The only exception that this function might throw is an exception of type int. If it throws an exception with a different type, either directly or indirectly, it cannot be caught by a regular int-type handler.

If this throw specifier is left empty with no type, this means the function is not allowed to throw exceptions. Functions with no throw specifier (regular functions) are allowed to throw exceptions with any type:

      int myfunction (int param) throw(); // no exceptions allowed
      int myfunction (int param);         // all exceptions allowed

 

 

Type Casting

Converting an expression of a given type into another type is known as type-casting. We have already seen some ways to type cast:

Implicit conversion

Implicit conversions do not require any operator. They are automatically performed when a value is copied to a compatible type. For example:

      short a=2000;
      int b;
      b=a;

Here, the value of a has been promoted from short to int and we have not had to specify any type-casting operator. This is known as a standard conversion. Standard conversions affect fundamental data types, and allow conversions such as the conversions between numerical types (short to int, int to float, double to int...), to or from bool, and some pointer conversions. Some of these conversions may imply a loss of precision, which the compiler can signal with a warning. This can be avoided with an explicit conversion.

Implicit conversions also include constructor or operator conversions, which affect classes that include specific constructors or operator functions to perform conversions. For example:

      class A {};
      class B { public: B (A a) {} };
      
      A a;
      B b=a;

Here, a implicit conversion happened between objects of class A and class B, because B has a constructor that takes an object of class A as parameter. Therefore implicit conversions from A to B are allowed.

Explicit conversion

C++ is a strong-typed language. Many conversions, specially those that imply a different interpretation of the value, require an explicit conversion. We have already seen two notations for explicit type conversion: functional and c-like casting:

      short a=2000;
      int b;
      b = (int) a;    // c-like cast notation
      b = int (a);    // functional notation

The functionality of these explicit conversion operators is enough for most needs with fundamental data types. However, these operators can be applied indiscriminately on classes and pointers to classes, which can lead to code that while being syntactically correct can cause runtime errors. For example, the following code is syntactically correct:

      // class type-casting
      #include <iostream>
      using namespace std;
      
      class CDummy {
          float i,j;
      };
      
      class CAddition {
            int x,y;
        public:
            CAddition (int a, int b) { x=a; y=b; }
            int result() { return x+y;}
      };
      
      int main () {
        CDummy d;
        CAddition * padd;
        padd = (CAddition*) &d;
        cout << padd->result();
        return 0;
      }
       

The program declares a pointer to CAddition, but then it assigns to it a reference to an object of another incompatible type using explicit type-casting:

      padd = (CAddition*) &d;

Traditional explicit type-casting allows to convert any pointer into any other pointer type, independently of the types they point to. The subsequent call to member result will produce either a run-time error or a unexpected result.

In order to control these types of conversions between classes, we have four specific casting operators: dynamic_cast, reinterpret_cast, static_cast and const_cast. Their format is to follow the new type enclosed between angle-brackets (<>) and immediately after, the expression to be converted between parentheses.

dynamic_cast <new_type> (expression)
reinterpret_cast <new_type> (expression)
static_cast <new_type> (expression)
const_cast <new_type> (expression)

The traditional type-casting equivalents to these expressions would be:

(new_type) expression
new_type (expression)

but each one with its own special characteristics:

dynamic_cast

dynamic_cast can be used only with pointers and references to objects. Its purpose is to ensure that the result of the type conversion is a valid complete object of the requested class.

Therefore, dynamic_cast is always successful when we cast a class to one of its base classes:

      class CBase { };
      class CDerived: public CBase { };
      
      CBase b; CBase* pb;
      CDerived d; CDerived* pd;
      
      pb = dynamic_cast<CBase*>(&d);     // ok: derived-to-base
      pd = dynamic_cast<CDerived*>(&b);  // wrong: base-to-derived

The second conversion in this piece of code would produce a compilation error since base-to-derived conversions are not allowed with dynamic_cast unless the base class is polymorphic.

When a class is polymorphic, dynamic_cast performs a special checking during runtime to ensure that the expression yields a valid complete object of the requested class:

      // dynamic_cast
      #include <iostream>
      #include <exception>
      using namespace std;
      
      class CBase { virtual dummy() {} };
      class CDerived: public CBase { int a; };
      
      int main () {
        try {
          CBase * pba = new CDerived;
          CBase * pbb = new CBase;
          CDerived * pd;
      
          pd = dynamic_cast<CDerived*>(pba);
          if (pd==0) cout << "Null pointer on first type-cast" << endl;
      
          pd = dynamic_cast<CDerived*>(pbb);
          if (pd==0) cout << "Null pointer on second type-cast" << endl;
      
        } catch (exception& e) {cout << "Exception: " << e.what();}
        return 0;
      }
Output
      Null pointer on second type-cast

Compatibility note: dynamic_cast requires the Run-Time Type Information (RTTI) to keep track of dynamic types. Some compilers include this feature as an option which is disabled by default. This feature must be enabled for runtime type checking using dynamic_cast.

The code tries to perform two dynamic casts from pointer objects of type CBase* (pba and pbb) to a pointer object of type CDerived*, but only the second one is successful. Notice their respective initializations:

      CBase * pba = new CDerived;
      CBase * pbb = new CBase;

Even though both are pointers of type CBase*, pba points to an object of type CDerived, while pbb points to an object of type CBase. Thus, when their respective type-castings are performed using dynamic_cast, pba is pointing to a full object of class CDerived, whereas pbb is pointing to an object of class CBase, which is an incomplete object of class CDerived.

When dynamic_cast cannot cast a pointer because it is not a complete object of the required class -as in the second conversion in the previous example- it returns a null pointer to indicate the failure. If dynamic_cast is used to convert to a reference type and the conversion is not possible, an exception of type bad_alloc is thrown instead.

dynamic_cast can also cast null pointers even between pointers to unrelated classes, and can also cast pointers of any type to void pointers (void*).

static_cast

static_cast can perform conversions between pointers to related classes, not only from the derived class to its base, but also from a base class to its derived. This ensures that at least the classes are compatible if the proper object is converted, but no safety check is performed during runtime to check if the object being converted is in fact a full object of the destination type. Therefore, it is up to the programmer to ensure that the conversion is safe. On the other side, the overhead of the type-safety checks of dynamic_cast is avoided.

      class CBase {};
      class CDerived: public CBase {};
      CBase * a = new CBase;
      CDerived * b = static_cast<CDerived*>(a);

This would be valid, although b would point to an incomplete object of the class and could lead to runtime errors if dereferenced.

static_cast can also be used to perform any other non-pointer conversion that could also be performed implicitly, like for example standard conversion between fundamental types:

      double d=3.14159265;
      int i = static_cast<int>(d); 

Or any conversion between classes with explicit constructors or operator functions as described in "implicit conversions" above.

reinterpret_cast

reinterpret_cast converts any pointer type to any other pointer type, even of unrelated classes. The operation result is a simple binary copy of the value from one pointer to the other. All pointer conversions are allowed: neither the content pointed nor the pointer type itself is checked.

It can also cast pointers to or from integer types. The format in which this integer value represents a pointer is platform-specific. The only guarantee is that a pointer cast to an integer type large enough to fully contain it, is granted to be able to be cast back to a valid pointer.

The conversions that can be performed by reinterpret_cast but not by static_cast have no specific uses in C++ are low-level operations, whose interpretation results in code which is generally system-specific, and thus non-portable. For example:

      class A {};
      class B {};
      A * a = new A;
      B * b = reinterpret_cast<B*>(a);

This is valid C++ code, although it does not make much sense, since now we have a pointer that points to an object of an incompatible class, and thus dereferencing it is unsafe.

const_cast

This type of casting manipulates the constness of an object, either to be set or to be removed. For example, in order to pass a const argument to a function that expects a non-constant parameter:

      // const_cast
      #include <iostream>
      using namespace std;
      
      void print (char * str)
      {
        cout << str << endl;
      }
      
      int main () {
        const char * c = "sample text";
        print ( const_cast<char *> (c) );
        return 0;
      }
Output
      sample text

typeid

typeid allows to check the type of an expression:

typeid (expression)

This operator returns a reference to a constant object of type type_info that is defined in the standard header file <typeinfo>. This returned value can be compared with another one using operators == and != or can serve to obtain a null-terminated character sequence representing the data type or class name by using its name() member.

      // typeid
      #include <iostream>
      #include <typeinfo>
      using namespace std;
      
      int main () {
        int * a,b;
        a=0; b=0;
        if (typeid(a) != typeid(b))
        {
          cout << "a and b are of different types:\n";
          cout << "a is: " << typeid(a).name() << '\n';
          cout << "b is: " << typeid(b).name() << '\n';
        }
        return 0;
      }
Output
      a and b are of different types:
      a is: int *
      b is: int  

When typeid is applied to classes typeid uses the RTTI to keep track of the type of dynamic objects. When typeid is applied to an expression whose type is a polymorphic class, the result is the type of the most derived complete object:

      // typeid, polymorphic class
      #include <iostream>
      #include <typeinfo>
      #include <exception>
      using namespace std;
      
      class CBase {virtual f(){} };
      class CDerived : public CBase {};
      
      int main () {
        try {
          CBase* a = new CBase;
          CBase* b = new CDerived;
          cout << "a is: " << typeid(a).name() << '\n';
          cout << "b is: " << typeid(b).name() << '\n';
          cout << "*a is: " << typeid(*a).name() << '\n';
          cout << "*b is: " << typeid(*b).name() << '\n';
        } catch (exception& e) { cout << "Exception: " << e.what()
      << endl; }
        return 0;
      }
Output
      a is: class CBase *
      b is: class CBase *
      *a is: class CBase
      *b is: class CDerived

Notice how the type that typeid considers for pointers is the pointer type itself (both a and b are of type class CBase *). However, when typeid is applied to objects (like *a and *b) typeid yields their dynamic type (i.e. the type of their most derived complete object).

If the type typeid evaluates is a pointer preceded by the dereference operator (*), and this pointer has a null value, typeid

 

 

Preprocessor directives

Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor. These lines are always preceded by a pound sign (#). The preprocessor is executed before the actual compilation of code begins, therefore the preprocessor digests all these directives before any code is generated by the statements.

These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. No semicolon (;) is expected at the end of a preprocessor directive. The only way a preprocessor directive can extend through more than one line is by preceding the newline character at the end of the line by a backslash (\).

macro definitions (#define, #undef)

To define preprocessor macros we can use #define. Its format is:

#define identifier replacement

When the preprocessor encounters this directive, it replaces any occurrence of identifier in the rest of the code by replacement. This replacement can be an expression, a statement, a block or simply anything. The preprocessor does not understand C++, it simply replaces any occurrence of identifier by replacement.

      #define TABLE_SIZE 100
      int table1[TABLE_SIZE];
      int table2[TABLE_SIZE]; 

After the preprocessor has replaced TABLE_SIZE, the code becomes equivalent to:

      int table1[100];
      int table2[100]; 

This use of #define as constant definer is already known by us from previuos tutorials, but #define can work also with parameters to define function macros:

      #define getmax(a,b) a>b?a:b

This would replace any occurrence of getmax followed by two arguments by the replacement expression, but also replacing each argument by its identifier, exactly as you would expect if it was a function:

      // function macro
      #include <iostream>
      using namespace std;
      
      #define getmax(a,b) ((a)>(b)?(a):(b))
      
      int main()
      {
        int x=5, y;
        y= getmax(x,2);
        cout << y << endl;
        cout << getmax(7,x) << endl;
        return 0;
      }
Output
      5
      7

Defined macros are not affected by block structure. A macro lasts until it is undefined with the #undef preprocessor directive:

      #define TABLE_SIZE 100
      int table1[TABLE_SIZE];
      #undef TABLE_SIZE
      #define TABLE_SIZE 200
      int table2[TABLE_SIZE];

This would generate the same code as:

      int table1[100];
      int table2[200];

Function macro definitions accept two special operators (# and ##) in the replacement sequence:
If the operator
# is used before a parameter is used in the replacement sequence, that parameter is replaced by a string literal (as if it were enclosed between double quotes)

      #define str(x) #x
      cout << str(test);

This would be translated into:

      cout << "test";

The operator ## concatenates two arguments leaving no blank spaces between them:

      #define glue(a,b) a ## b
      glue(c,out) << "test";

This would also be translated into:

      cout << "test";

Because preprocessor replacements happen before any C++ syntax check, macro definitions can be a tricky feature, but be careful: code that relies heavily on complicated macros may result obscure to other programmers, since the syntax they expect is on many occasions different from the regular expressions programmers expect in C++.

Conditional inclusions (#ifdef, #ifndef, #if, #endif, #else and #elif)

These directives allow to include or discard part of the code of a program if a certain condition is met.

#ifdef allows a section of a program to be compiled only if the macro that is specified as the parameter has been defined, no matter which its value is. For example:

      #ifdef TABLE_SIZE
      int table[TABLE_SIZE];
      #endif 

In this case, the line of code int table[TABLE_SIZE]; is only compiled if TABLE_SIZE was previously defined with #define, independently of its value. If it was not defined, that line will not be included in the program compilation.

#ifndef serves for the exact opposite: the code between #ifndef and #endif directives is only compiled if the specified identifier has not been previously defined. For example:

      #ifndef TABLE_SIZE
      #define TABLE_SIZE 100
      #endif
      int table[TABLE_SIZE];

In this case, if when arriving at this piece of code, the TABLE_SIZE macro has not been defined yet, it would be defined to a value of 100. If it already existed it would keep its previous value since the #define directive would not be executed.

The #if, #else and #elif (i.e., "else if") directives serve to specify some condition to be met in order for the portion of code they surround to be compiled. The condition that follows #if or #elif can only evaluate constant expressions, including macro expressions. For example:

      #if TABLE_SIZE>200
      #undef TABLE_SIZE
      #define TABLE_SIZE 200
       
      #elif TABLE_SIZE<50
      #undef TABLE_SIZE
      #define TABLE_SIZE 50
       
      #else
      #undef TABLE_SIZE
      #define TABLE_SIZE 100
      #endif
       
      int table[TABLE_SIZE]; 

Notice how the whole structure of #if, #elif and #else chained directives ends with #endif.

The behavior of #ifdef and #ifndef can also be achieved by using the special operators defined and !defined respectively in any #if or #elif directive:

      #if !defined TABLE_SIZE
      #define TABLE_SIZE 100
      #elif defined ARRAY_SIZE
      #define TABLE_SIZE ARRAY_SIZE
      int table[TABLE_SIZE];

Line control (#line)

When we compile a program and some error happen during the compiling process, the compiler shows an error message with references to the name of the file where the error happened and a line number, so it is easier to find the code generating the error.

The #line directive allows us to control both things, the line numbers within the code files as well as the file name that we want that appears when an error takes place. Its format is:

#line number "filename"

Where number is the new line number that will be assigned to the next code line. The line numbers of successive lines will be increased one by one from this point on.

"filename" is an optional parameter that allows to redefine the file name that will be shown. For example:

      #line 20 "assigning variable"
      int a?; 

This code will generate an error that will be shown as error in file "assigning variable", line 20.

Error directive (#error)

This directive aborts the compilation process when it is found, generating a compilation the error that can be specified as its parameter:

      #ifndef __cplusplus
      #error A C++ compiler is required!
      #endif

This example aborts the compilation process if the macro name __cplusplus is not defined (this macro name is defined by default in all C++ compilers).

Source file inclusion (#include)

This directive has also been used assiduously in other sections of this tutorial. When the preprocessor finds an #include directive it replaces it by the entire content of the specified file. There are two ways to specify a file to be included:

      #include "file"
      #include <file>

The only difference between both expressions is the places (directories) where the compiler is going to look for the file. In the first case where the file name is specified between double-quotes, the file is searched first in the same directory that includes the file containing the directive. In case that it is not there, the compiler searches the file in the default directories where it is configured to look for the standard header files.
If the file name is enclosed between angle-brackets
<> the file is searched directly where the compiler is configured to look for the standard header files. Therefore, standard header files are usually included in angle-brackets, while other specific header files are included using quotes.

Pragma directive (#pragma)

This directive is used to specify diverse options to the compiler. These options are specific for the platform and the compiler you use. Consult the manual or the reference of your compiler for more information on the possible parameters that you can define with #pragma.

If the compiler does not support a specific argument for #pragma, it is ignored - no error is generated.

Predefined macro names

The following macro names are defined at any time:

macro

value

__LINE__

Integer value representing the current line in the source code file being compiled.

__FILE__

A string literal containing the presumed name of the source file being compiled.

__DATE__

A string literal in the form "Mmm dd yyyy" containing the date in which the compilation process began.

__TIME__

A string literal in the form "hh:mm:ss" containing the time at which the compilation process began.

__cplusplus

An integer value. All C++ compilers have this constant defined to some value. If the compiler is fully compliant with the C++ standard its value is equal or greater than 199711L depending on the version of the standard they comply.

For example:

      // standard macro names
      #include <iostream>
      using namespace std;
      
      int main()
      {
        cout << "This is the line number " << __LINE__;
        cout << " of file " << __FILE__ << ".\n";
        cout << "Its compilation began " << __DATE__;
        cout << " at " << __TIME__ << ".\n";
        cout << "The compiler gives a __cplusplus value of " << __cplusplus;
        return 0;
      }
Output
      This is the line number 7 of file /home/jay/stdmacronames.cpp.
      Its compilation began Nov  1 2005 at 10:12:29.
      The compiler gives a __cplusplus 

 

C++ Standard Library

Input/Output with files

C++ provides the following classes to perform output and input of characters to/from files:

These classes are derived directly or indirectly from the classes istream, and ostream. We have already used objects whose types were these classes: cin is an object of class istream and cout is an object of class ostream. Therfore, we have already been using classes that are related to our file streams. And in fact, we can use our file streams the same way we are already used to use cin and cout, with the only difference that we have to associate these streams with physical files. Let's see an example:

      // basic file operations
      #include <iostream>
      #include <fstream>
      using namespace std;
      
      int main () {
        ofstream myfile;
        myfile.open ("example.txt");
        myfile << "Writing this to a file.\n";
        myfile.close();
        return 0;
      }
Output
      [file example.txt]
      Writing this to a file

This code creates a file called example.txt and inserts a sentence into it in the same way we are used to do with cout, but using the file stream myfile instead.

But let's go step by step:

Open a file

The first operation generally performed on an object of one of these classes is to associate it to a real file. This procedure is known as to open a file. An open file is represented within a program by a stream object (an instantiation of one of these classes, in the previous example this was myfile) and any input or output operation performed on this stream object will be applied to the physical file associated to it.

In order to open a file with a stream object we use its member function open():

open (filename, mode);

Where filename is a null-terminated character sequence of type const char * (the same type that string literals have) representing the name of the file to be opened, and mode is an optional parameter with a combination of the following flags:

ios::in

Open for input operations.

ios::out

Open for output operations.

ios::binary

Open in binary mode.

ios::ate

Set the initial position at the end of the file.
If this flag is not set to any value, the initial position is the beginning of the file.

ios::app

All output operations are performed at the end of the file, appending the content to the current content of the file. This flag can only be used in streams open for output-only operations.

ios::trunc

If the file opened for output operations already existed before, its previous content is deleted and replaced by the new one.

All these flags can be combined using the bitwise operator OR (|). For example, if we want to open the file example.bin in binary mode to add data we could do it by the following call to member function open():

      ofstream myfile;
      myfile.open ("example.bin", ios::out | ios::app | ios::binary); 

Each one of the open() member functions of the classes ofstream, ifstream and fstream has a default mode that is used if the file is opened without a second argument:

class

default mode parameter

ofstream

ios::out

ifstream

ios::in

fstream

ios::in | ios::out

For ifstream and ofstream classes, ios::in and ios::out are automatically and respectivelly assumed, even if a mode that does not include them is passed as second argument to the open() member function.

The default value is only applied if the function is called without specifying any value for the mode parameter. If the function is called with any value in that parameter the default mode is overridden, not combined.

File streams opened in binary mode perform input and output operations independently of any format considerations. Non-binary files are known as text files, and some translations may occur due to formatting of some special characters (like newline and carriage return characters).

Since the first task that is performed on a file stream object is generally to open a file, these three classes include a constructor that automatically calls the open() member function and has the exact same parameters as this member. Therefor, we could also have declared the previous myfile object and conducted the same opening operation in our previous example by writing:

      ofstream myfile ("example.bin", ios::out | ios::app | ios::binary);

Combining object construction and stream opening in a single statement. Both forms to open a file are valid and equivalent.

To check if a file stream was successful opening a file, you can do it by calling to member is_open() with no arguments. This member function returns a bool value of true in the case that indeed the stream object is associated with an open file, or false otherwise:

      if (myfile.is_open()) { /* ok, proceed with output */ }

Closing a file

When we are finished with our input and output operations on a file we shall close it so that its resources become available again. In order to do that we have to call the stream's member function close(). This member function takes no parameters, and what it does is to flush the associated buffers and close the file:

      myfile.close();

Once this member function is called, the stream object can be used to open another file, and the file is available again to be opened by other processes.

In case that an object is destructed while still associated with an open file, the destructor automatically calls the member function close().

Text files

Text file streams are those where we do not include the ios::binary flag in their opening mode. These files are designed to store text and thus all values that we input or output from/to them can suffer some formatting transformations, which do not necessarily correspond to their literal binary value.

Data output operations on text files are performed in the same way we operated with cout:

      // writing on a text file
      #include <iostream>
      #include <fstream>
      using namespace std;
      
      int main () {
        ofstream myfile ("example.txt");
        if (myfile.is_open())
        {
          myfile << "This is a line.\n";
          myfile << "This is another line.\n";
          myfile.close();
        }
        else cout << "Unable to open file";
        return 0;
      }
Output
      [file example.txt]
      This is a line.
      This is another line.

Data input from a file can also be performed in the same way that we did with cin:

      // reading a text file
      #include <iostream>
      #include <fstream>
      #include <string>
      using namespace std;
      
      int main () {
        string line;
        ifstream myfile ("example.txt");
        if (myfile.is_open())
        {
          while (! myfile.eof() )
          {
            getline (myfile,line);
            cout << line << endl;
          }
          myfile.close();
        }
      
        else cout << "Unable to open file"; 
      
        return 0;
      }
Output
      This is a line.
      This is another line.  

This last example reads a text file and prints out its content on the screen. Notice how we have used a new member function, called eof() that returns true in the case that the end of the file has been reached. We have created a while loop that finishes when indeed myfile.eof() becomes true (i.e., the end of the file has been reached).

Checking state flags

In addition to eof(), which checks if the end of file has been reached, other member functions exist to check the state of a stream (all of them return a bool value):

bad()

Returns true if a reading or writing operation fails. For example in the case that we try to write to a file that is not open for writing or if the device where we try to write has no space left.

fail()

Returns true in the same cases as bad(), but also in the case that a format error happens, like when an alphabetical character is extracted when we are trying to read an integer number.

eof()

Returns true if a file open for reading has reached the end.

good()

It is the most generic state flag: it returns false in the same cases in which calling any of the previous functions would return true.

In order to reset the state flags checked by any of these member functions we have just seen we can use the member function clear(), which takes no parameters.

get and put stream pointers

All i/o streams objects have, at least, one internal stream pointer:

ifstream, like istream, has a pointer known as the get pointer that points to the element to be read in the next input operation.

ofstream, like ostream, has a pointer known as the put pointer that points to the location where the next element has to be written.

Finally, fstream, inherits both, the get and the put pointers, from iostream (which is itself derived from both istream and ostream).

These internal stream pointers that point to the reading or writing locations within a stream can be manipulated using the following member functions:

tellg() and tellp()

These two member functions have no parameters and return a value of the member type pos_type, which is an integer data type representing the current position of the get stream pointer (in the case of tellg) or the put stream pointer (in the case of tellp).

seekg() and seekp()

These functions allow us to change the position of the get and put stream pointers. Both functions are overloaded with two different prototypes. The first prototype is:

seekg ( position );
seekp ( position );

Using this prototype the stream pointer is changed to the absolute position position (counting from the beginning of the file). The type for this parameter is the same as the one returned by functions tellg and tellp: the member type pos_type, which is an integer value.

The other prototype for these functions is:

seekg ( offset, direction );
seekp ( offset, direction );

Using this prototype, the position of the get or put pointer is set to an offset value relative to some specific point determined by the parameter direction. offset is of the member type off_type, which is also an integer type. And direction is of type seekdir, which is an enumerated type (enum) that determines the point from where offset is counted from, and that can take any of the following values:

ios::beg

offset counted from the beginning of the stream

ios::cur

offset counted from the current position of the stream pointer

ios::end

offset counted from the end of the stream

The following example uses the member functions we have just seen to obtain the size of a file:

      // obtaining file size
      #include <iostream>
      #include <fstream>
      using namespace std;
      
      int main () {
        long begin,end;
        ifstream myfile ("example.txt");
        begin = myfile.tellg();
        myfile.seekg (0, ios::end);
        end = myfile.tellg();
        myfile.close();
        cout << "size is: " << (end-begin) << " bytes.\n";
        return 0;
      }
Output
      size is: 40 bytes.

Binary files

In binary files, to input and output data with the extraction and insertion operators (<< and >>) and functions like getline is not efficient, since we do not need to format any data, and data may not use the separation codes used by text files to separate elements (like space, newline, etc...).

File streams include two member functions specifically designed to input and output binary data sequentially: write and read. The first one (write) is a member function of ostream inherited by ofstream. And read is a member function of istream that is inherited by ifstream. Objects of class fstream have both members. Their prototypes are:

write ( memory_block, size );
read ( memory_block, size );

Where memory_block is of type "pointer to char" (char*), and represents the address of an array of bytes where the read data elements are stored or from where the data elements to be written are taken. The size parameter is an integer value that specifies the number of characters to be read or written from/to the memory block.

      // reading a complete binary file
      #include <iostream>
      #include <fstream>
      using namespace std;
      
      ifstream::pos_type size;
      char * memblock;
      
      int main () {
        ifstream file ("example.txt", ios::in|ios::binary|ios::ate);
        if (file.is_open())
        {
          size = file.tellg();
          memblock = new char [size];
          file.seekg (0, ios::beg);
          file.read (memblock, size);
          file.close();
      
          cout << "the complete file content is in memory";
      
          delete[] memblock;
        }
        else cout << "Unable to open file";
        return 0;
      }
Output
      the complete file content is in memory

In this example the entire file is read and stored in a memory block. Let's examine how this is done:

First, the file is open with the ios::ate flag, which means that the get pointer will be positioned at the end of the file. This way, when we call to member tellg(), we will directly obtain the size of the file. Notice the type we have used to declare variable size:

      ifstream::pos_type size;

ifstream::pos_type is a specific type used for buffer and file positioning and is the type returned by file.tellg(). This type is defined as an integer type, therefore we can conduct on it the same operations we conduct on any other integer value, and can safely be converted to another integer type large enough to contain the size of the file. For a file with a size under 2GB we could use int:

      int size;
      size = (int) file.tellg();

Once we have obtained the size of the file, we request the allocation of a memory block large enough to hold the entire file:

      memblock = new char[size];

Right after that, we proceed to set the get pointer at the beginning of the file (remember that we opened the file with this pointer at the end), then read the entire file, and finally close it:

      file.seekg (0, ios::beg);
      file.read (memblock, size);
      file.close();

At this point we could operate with the data obtained from the file. Our program simply announces that the content of the file is in memory and then terminates.

Buffers and Synchronization

When we operate with file streams, these are associated to an internal buffer of type streambuf. This buffer is a memory block that acts as an intermediary between the stream and the physical file. For example, with an ofstream, each time the member function put (which writes a single character) is called, the character is not written directly to the physical file with which the stream is associated. Instead of that, the character is inserted in that stream's intermediate buffer.

When the buffer is flushed, all the data contained in it is written to the physical medium (if it is an output stream) or simply freed (if it is an input stream). This process is called synchronization and takes place under any of the following circumstances: